Multi-party Poisoning through Generalized p-Tampering
In a poisoning attack against a learning algorithm, an adversary tampers with
a fraction of the training data T with the goal of increasing the
classification error of the constructed hypothesis/model over the final test
distribution. In the distributed setting, T might be gathered gradually from
m data providers P_1, ..., P_m who generate and submit their shares of T in
an online way.
In this work, we initiate a formal study of (k,p)-poisoning attacks in which
an adversary controls k ∈ [m] of the parties, and even for each corrupted
party P_i, the adversary submits some poisoned data T'_i on behalf of P_i
that is still "(1-p)-close" to the correct data T_i (e.g., a (1-p) fraction
of T'_i is still honestly generated). For k = m, this model becomes the
traditional notion of poisoning, and for p = 1 it coincides with the standard
notion of corruption in multi-party computation.
We prove that if there is an initial constant error for the generated
hypothesis h, there is always a (k,p)-poisoning attacker who can decrease the
confidence of h (in having a small error), or alternatively increase the
error of h, by Ω(p · k/m). Our attacks can be implemented in polynomial time
given samples from the correct data, and they use no wrong labels if the
original distributions are not noisy.
At a technical level, we prove a general lemma about biasing bounded
functions f(x_1, ..., x_n) ∈ [0, 1] through an attack model in which each
block x_i might be controlled by an adversary with marginal probability p in
an online way. When the probabilities are independent, this coincides with
the model of p-tampering attacks, thus we call our model generalized
p-tampering. We prove the power of such attacks by incorporating ideas from
the context of coin-flipping attacks into the p-tampering model and
generalize the results in both of these areas.
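
To make the tampering model concrete, the following is a minimal Python
sketch (illustrative, not the paper's construction) of an online p-tampering
adversary biasing a bounded function over independent fair bits: with
probability p per block, it substitutes the bit that greedily maximizes a
Monte-Carlo estimate of the conditional expectation. All names and parameter
values here are assumptions for illustration.

import random

def estimate_f(f, prefix, n, rng, samples=150):
    # Monte-Carlo estimate of E[f] when the remaining n - len(prefix)
    # blocks are drawn honestly (fair coins).
    total = 0.0
    for _ in range(samples):
        suffix = [rng.randint(0, 1) for _ in range(n - len(prefix))]
        total += f(prefix + suffix)
    return total / samples

def p_tamper_once(f, n, p, rng):
    # One run against a bounded f: {0,1}^n -> [0,1]. Each block is
    # independently tamperable with probability p; a tampered block is
    # set greedily to maximize the estimated conditional expectation.
    prefix = []
    for _ in range(n):
        if rng.random() < p:  # adversary controls this block
            bit = max((0, 1), key=lambda b: estimate_f(f, prefix + [b], n, rng))
        else:                 # honest provider submits a fair coin
            bit = rng.randint(0, 1)
        prefix.append(bit)
    return f(prefix)

# Example: bias the majority of 15 fair coins with tampering rate p = 0.2.
maj = lambda xs: float(sum(xs) > len(xs) // 2)
rng = random.Random(0)
runs = 300
avg = sum(p_tamper_once(maj, 15, 0.2, rng) for _ in range(runs)) / runs
print(f"E[maj] under 0.2-tampering ~ {avg:.2f} (honest baseline: 0.5)")

Under these assumptions, even this naive greedy rule should push E[maj]
noticeably above the honest baseline of 1/2, which is the kind of bias the
lemma quantifies.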
Bounding Training Data Reconstruction in DP-SGD
Differentially private training offers protection that is usually interpreted
as a guarantee against membership inference attacks. By proxy, this
guarantee extends to other threats like reconstruction attacks attempting to
extract complete training examples. Recent works provide evidence that if one
does not need to protect against membership attacks but instead only wants to
protect against training data reconstruction, then utility of private models
can be improved because less noise is required to protect against these more
ambitious attacks. We investigate this further in the context of DP-SGD, a
standard algorithm for private deep learning, and provide an upper bound on the
success of any reconstruction attack against DP-SGD together with an attack
that empirically matches the predictions of our bound. Together, these two
results open the door to fine-grained investigations on how to set the privacy
parameters of DP-SGD in practice to protect against reconstruction attacks.
Finally, we use our methods to demonstrate that different settings of the
DP-SGD parameters leading to the same DP guarantees can result in significantly
different success rates for reconstruction, indicating that the DP guarantee
alone might not be a good proxy for controlling the protection against
reconstruction attacks.
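
To ground the parameters this result concerns, here is a minimal NumPy sketch
of a single DP-SGD step: per-example gradient clipping followed by Gaussian
noising of the batch average. The function name and the default values of lr,
clip_norm, and noise_multiplier are illustrative assumptions, not the paper's
settings; in practice these are chosen together with a privacy accountant
that maps (noise multiplier, sampling rate, number of steps) to an (ε, δ)
guarantee.

import numpy as np

def dp_sgd_step(params, per_example_grads, lr=0.1, clip_norm=1.0,
                noise_multiplier=1.0, rng=np.random.default_rng(0)):
    # Clip each per-example gradient to L2 norm at most clip_norm.
    clipped = [g * min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
               for g in per_example_grads]
    batch_size = len(clipped)
    # Noise with std sigma * C on the summed gradients is equivalent to
    # std sigma * C / B on the batch mean.
    noise = rng.normal(0.0, noise_multiplier * clip_norm / batch_size,
                       size=params.shape)
    return params - lr * (np.mean(clipped, axis=0) + noise)

# Toy usage: one update on a 4-dimensional parameter vector.
rng = np.random.default_rng(1)
params = np.zeros(4)
grads = [rng.normal(size=4) for _ in range(8)]
params = dp_sgd_step(params, grads, noise_multiplier=1.1)

The abstract's final observation corresponds to the fact that several
(clip_norm, noise_multiplier, batch size, steps) combinations can satisfy the
same (ε, δ) while leaving different amounts of room for reconstruction.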
Just Rotate it: Deploying Backdoor Attacks via Rotation Transformation
Recent works have demonstrated that deep learning models are vulnerable to
backdoor poisoning attacks, in which the attack instills spurious correlations
with external trigger patterns or objects (e.g., stickers, sunglasses). We
find that such external trigger signals are unnecessary, as highly effective
backdoors can be easily inserted using rotation-based image transformation. Our
method constructs the poisoned dataset by rotating a limited number of objects
and labeling them incorrectly; once trained on it, the victim's model makes
undesirable predictions at run-time inference. Through comprehensive empirical
studies on image classification and object detection tasks, we show that the
attack achieves a high success rate while maintaining performance on clean
inputs. Furthermore, we evaluate standard data augmentation techniques
and four different backdoor defenses against our attack and find that none of
them can serve as a consistent mitigation approach. Our attack can be easily
deployed in the real world since it only requires rotating the object, as we
show in both image classification and object detection applications. Overall,
our work highlights a new, simple, physically realizable, and highly effective
vector for backdoor attacks. Our video demo is available at
https://youtu.be/6JIF8wnX34M.
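
Since the attack is purely a data transformation, the poisoning step admits a
minimal torchvision-style sketch; the function name, rotation angle, poison
rate, and target label below are illustrative assumptions rather than the
paper's exact configuration.

import random
from torchvision.transforms import functional as TF

def poison_with_rotation(dataset, target_label, angle=90.0,
                         poison_rate=0.01, seed=0):
    # Rotate a poison_rate fraction of the samples by `angle` degrees and
    # relabel them to target_label; leave every other sample untouched.
    rng = random.Random(seed)
    chosen = set(rng.sample(range(len(dataset)),
                            int(poison_rate * len(dataset))))
    poisoned = []
    for i, (img, label) in enumerate(dataset):
        if i in chosen:
            poisoned.append((TF.rotate(img, angle), target_label))
        else:
            poisoned.append((img, label))
    return poisoned

At test time, the backdoor would be triggered by presenting the object
rotated by roughly the same angle, which is what makes the attack physically
realizable.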